perf: parallel generate, speed up 10% #1001

xusd320 · 2024-03-29T06:29:38Z

Close #976

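Changes:

In the generate stage, entry chunks are now generated in parallel with normal chunks instead of after them. When an entry needs to record the hash of a child chunk, an 8-character random id generated by nanoid is used as a placeholder; once all normal chunks have been generated, each placeholder is replaced with the real file hash.
Added an e2e case for chunk hash replacement.
thread_pool exposes rayon's join method, used to run two ops in parallel.
ChunkPot::split_modules does not need to return a Result type, so that was changed along the way.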
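
The placeholder scheme in the first item could be sketched roughly as follows. This is a minimal illustration, not the PR's code: the type aliases and `replace_placeholders` are hypothetical names, and the placeholder strings are passed in by the caller rather than generated with nanoid so that the sketch stays standard-library only.

```rust
use std::collections::HashMap;

/// Hypothetical map: chunk id -> temporary placeholder written into the entry chunk.
type Placeholders = HashMap<String, String>;
/// Hypothetical map: chunk id -> real content hash, known once normal chunks are generated.
type RealHashes = HashMap<String, String>;

/// Swap every placeholder in the entry chunk's content for the real hash.
/// Returns an error if a placeholder has no matching real hash.
fn replace_placeholders(
    entry_content: &str,
    placeholders: &Placeholders,
    real_hashes: &RealHashes,
) -> Result<String, String> {
    let mut out = entry_content.to_string();
    for (chunk_id, placeholder) in placeholders {
        let hash = real_hashes
            .get(chunk_id)
            .ok_or_else(|| format!("no hash generated for chunk {chunk_id}"))?;
        // One full scan of the content per placeholder.
        out = out.replace(placeholder.as_str(), hash);
    }
    Ok(out)
}
```

Because the entry content only depends on placeholders, it can be produced before the normal chunks finish; the replacement step is the only part that has to wait for them.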

Performance verification:
yuyanAssets improves by 10%

Current branch:

master:

Summary by CodeRabbit

New Features
- Added functionality to generate chunk files with hashed names.
- Introduced the ability to set the mode to "production" and enable hashing for generated chunks.
- Added a new public function to the thread pool module for concurrent execution.
- Validated correct hash value replacement in build results.
Refactor
- Refactored chunk file generation logic.
- Reorganized chunk file processing and replacement logic.
Documentation
- Updated mako.config.json to include new configuration options.
Style
- Set font sizes in CSS files for better styling.

coderabbitai · 2024-03-29T06:29:47Z

Walkthrough

The changes introduce new dependencies, refactor chunk file generation logic, enhance file processing, and add functionality for concurrent operations. Configuration options for production mode and chunk hashing are now available.

Changes

File	Summary
crates/mako/Cargo.toml	Added `nanoid` dependency version `0.4.0`.
crates/mako/src/chunk_pot.rs & util.rs	Publicized `stylesheet` field, adjusted return types, and made certain items public. Updated `file_content_hash` to truncate hash to a specified length.
crates/mako/src/generate_chunks.rs	Refactored chunk file generation, introduced hashing, and reorganized processing logic.
crates/mako/src/thread_pool.rs	Added `join` function for concurrent execution of closures.
e2e/fixtures/generate.chunks-hash-replace/expect.js	New file for validating hash value replacements.
e2e/fixtures/generate.chunks-hash-replace/mako.config.json	Introduces production mode and chunk hashing configuration.
e2e/fixtures/generate.chunks-hash-replace/src/*.css/.ts	Added new CSS and TypeScript files with styling and module imports.

Poem

A rabbit hops, code changes unfold,
Dependencies new, logic bold.
Hashes truncate, threads unite,
Files styled, in code's dance, they write. 🐇✨

Note

Pull Request Summarized by CodeRabbit Free

Your organization is on the Free plan. CodeRabbit will generate a high-level summary and a walkthrough for each pull request. For a comprehensive line-by-line review, please upgrade your subscription to CodeRabbit Pro by visiting https://coderabbit.ai

Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>.
- Generate unit testing code for this file.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
- @coderabbitai generate unit testing code for this file.
- @coderabbitai modularize this function.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai generate interesting stats about this repository and render them as a table.
- @coderabbitai show all the console.log statements in this repository.
- @coderabbitai read src/utils.ts and generate unit testing code.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (invoked as PR comments)

@coderabbitai pause to pause the reviews on a PR.
@coderabbitai resume to resume the paused reviews.
@coderabbitai review to trigger a review. This is useful when automatic reviews are disabled for the repository.
@coderabbitai resolve to resolve all the CodeRabbit review comments.
@coderabbitai help to get help.

Additionally, you can add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.

CodeRabbit Configuration File (`.coderabbit.yaml`)

You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
The JSON schema for the configuration file is available here.
If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/coderabbit-overrides.v2.json

CodeRabbit Discord Community

Join our Discord Community to get help, request features, and share feedback.

coderabbitai

Actionable comments posted: 1

Review Status

Configuration used: CodeRabbit UI

Commits

Files that changed from the base of the PR and between d1fdb1a and 811c976.

Files ignored due to path filters (1)

Cargo.lock is excluded by !**/*.lock

Files selected for processing (11)

crates/mako/Cargo.toml (1 hunks)
crates/mako/src/chunk_pot.rs (7 hunks)
crates/mako/src/chunk_pot/util.rs (1 hunks)
crates/mako/src/generate_chunks.rs (2 hunks)
crates/mako/src/thread_pool.rs (1 hunks)
e2e/fixtures/generate.chunks-hash-replace/expect.js (1 hunks)
e2e/fixtures/generate.chunks-hash-replace/mako.config.json (1 hunks)
e2e/fixtures/generate.chunks-hash-replace/src/index.css (1 hunks)
e2e/fixtures/generate.chunks-hash-replace/src/index.ts (1 hunks)
e2e/fixtures/generate.chunks-hash-replace/src/lazy.css (1 hunks)
e2e/fixtures/generate.chunks-hash-replace/src/lazy.ts (1 hunks)

Files skipped from review due to trivial changes (1)

e2e/fixtures/generate.chunks-hash-replace/src/index.css

Additional Context Used

Additional comments not posted (14)

e2e/fixtures/generate.chunks-hash-replace/src/lazy.ts (1)

1-2: The changes in lazy.ts are straightforward and correct. It properly imports a CSS file and exports a string value.

e2e/fixtures/generate.chunks-hash-replace/mako.config.json (1)

1-4: The mako.config.json file is correctly formatted and the configuration settings align with the PR's objectives of optimizing the build process.

e2e/fixtures/generate.chunks-hash-replace/src/lazy.css (1)

1-5: The CSS changes in lazy.css are syntactically correct and follow standard CSS practices.

e2e/fixtures/generate.chunks-hash-replace/src/index.ts (1)

1-3: The changes in index.ts are correct, including the dynamic import of the lazy module, which aligns with modern JavaScript practices for lazy-loading.

e2e/fixtures/generate.chunks-hash-replace/expect.js (1)

1-18: The assertions in expect.js are correctly implemented to test the chunk hash replacement functionality, aligning with the PR's objectives of ensuring the integrity of the new parallel generation mechanism.

crates/mako/src/thread_pool.rs (1)

29-39: The addition of the join function in thread_pool.rs is correctly implemented to facilitate parallel execution of operations using the Rayon thread pool, adhering to Rust's concurrency best practices.
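
For readers unfamiliar with the join pattern, a rough standard-library sketch is below. It is not the code in thread_pool.rs, which per the PR description wraps rayon's join; here scoped threads stand in for the rayon pool, and the function body is an assumption made for illustration only. The generic bounds mirror those of `rayon::join`.

```rust
use std::thread;

/// Run two closures concurrently and return both results.
/// Stand-in for `rayon::join`, built on scoped threads from std.
fn join<A, B, RA, RB>(op_a: A, op_b: B) -> (RA, RB)
where
    A: FnOnce() -> RA + Send,
    B: FnOnce() -> RB + Send,
    RA: Send,
    RB: Send,
{
    thread::scope(|s| {
        // `op_b` runs on a freshly spawned scoped thread...
        let handle_b = s.spawn(op_b);
        // ...while `op_a` runs on the calling thread.
        let result_a = op_a();
        let result_b = handle_b.join().expect("op_b panicked");
        (result_a, result_b)
    })
}
```

Unlike this sketch, rayon schedules both closures on an existing worker pool instead of spawning a thread per call, which is why a real build pipeline would prefer it.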

crates/mako/Cargo.toml (1)

21-21: The addition of the nanoid dependency with version 0.4.0 in Cargo.toml is correctly specified and aligns with the PR's requirements for generating temporary placeholders for chunk hashes.

crates/mako/src/chunk_pot.rs (2)

28-28: Making the stylesheet field public in the ChunkPot struct aligns with the PR's objectives of enhancing chunk processing.

186-200: > 📝 NOTE

This review was outside the diff hunks and was mapped to the diff hunk with the greatest overlap. Original lines [133-197]

The adjustments to return types and making the CssModules struct and get_css_chunk_filename function public are correctly implemented, facilitating the new functionality introduced in the PR.

crates/mako/src/chunk_pot/util.rs (1)

307-313: The introduction of CHUNK_FILE_NAME_HASH_LENGTH to standardize hash lengths across the application is a good practice. However, it's crucial to ensure that the chosen length of 8 characters strikes a balance between uniqueness and brevity to minimize the risk of hash collisions. Consider analyzing the potential for collisions with the expected volume of files and adjusting the length if necessary.
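
The collision analysis suggested here can be approximated with the birthday bound. The sketch below assumes the truncated hash is hexadecimal (16 possible symbols per character) and uniformly distributed; neither assumption is confirmed by this thread, and `collision_probability` is a hypothetical helper.

```rust
/// Approximate probability that at least two of `n` files share the same
/// truncated hash, using the birthday bound p ≈ 1 - exp(-n(n-1) / (2N)).
/// Assumes hex output and uniformly distributed hashes.
fn collision_probability(n: u64, hash_len: u32) -> f64 {
    let space = 16f64.powi(hash_len as i32); // N = 16^hash_len possible values
    let pairs = (n as f64) * (n.saturating_sub(1) as f64) / 2.0;
    // -expm1(-x) equals 1 - exp(-x) but keeps precision for tiny x.
    -(-pairs / space).exp_m1()
}
```

Under these assumptions, 1,000 output files with an 8-character hash give a collision probability of roughly 0.01%, so the risk only becomes noticeable for builds with far more chunks.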

crates/mako/src/generate_chunks.rs (4)

11-11: The use of nanoid for generating placeholders and the introduction of ChunksHashPlaceholder and ChunksHashReplacer are well-considered choices for managing the complexity of parallel chunk generation. Ensure to conduct performance testing to assess the impact of nanoid generation and analyze edge cases, such as the unlikely but possible generation of identical placeholders for different chunks.

Also applies to: 55-56

64-74: The parallel generation of entry and normal chunk files using thread_pool::join is a significant improvement in the build process. Ensure thorough testing is conducted to identify any potential data races or concurrency issues, especially considering the shared state that might be accessed by these parallel tasks.

150-156: The handling of CSS filenames and placeholders within the chunk generation process is well-integrated. Ensure that the naming and replacement logic for CSS files is reviewed for consistency and efficiency, particularly considering the unique requirements and conventions for CSS files compared to JS chunks.

Also applies to: 158-161

285-285: The hash_file_name function is crucial for generating hashed filenames. Ensure that edge cases, such as files without extensions or with multiple dots in their names, are adequately tested to guarantee the function's robustness in all scenarios.
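
The edge cases named in this comment can be made concrete with a small sketch. `with_hash` is a hypothetical helper that inserts the hash before the last extension; mako's actual hash_file_name is not shown in this thread and may behave differently.

```rust
/// Insert `hash` before the last extension of `file_name`.
/// Hypothetical helper for illustration only.
fn with_hash(file_name: &str, hash: &str) -> String {
    match file_name.rsplit_once('.') {
        // A leading dot (".gitignore") is part of the name, not an extension separator.
        Some((stem, ext)) if !stem.is_empty() => format!("{stem}.{hash}.{ext}"),
        // No extension at all: append the hash to the end.
        _ => format!("{file_name}.{hash}"),
    }
}
```

With this choice only the final dot is treated as the extension separator, so names with multiple dots keep everything before the last one intact.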

crates/mako/src/generate_chunks.rs

coderabbitai

Actionable comments posted: 2

Review Status

Configuration used: CodeRabbit UI

Commits

Files that changed from the base of the PR and between 811c976 and d8cd5bc.

Files selected for processing (1)

crates/mako/src/generate_chunks.rs (2 hunks)

Additional Context Used

Additional comments not posted (5)

crates/mako/src/generate_chunks.rs (5)

11-11: The addition of nanoid::nanoid aligns with the PR's objective to use temporary placeholders for chunk hashes. This is a lightweight and efficient way to generate unique IDs, suitable for the intended use case.

55-56: The introduction of ChunksHashPlaceholder and ChunksHashReplacer maps is a good design choice. It clearly separates the concerns of holding placeholders for chunk hashes and their eventual replacement values, contributing to code readability and maintainability.

72-114: The parallel generation of entry and normal chunk files using thread_pool::join is a significant improvement. It leverages concurrency to enhance performance, directly addressing the PR's objective. However, ensure that the thread pool is appropriately sized to avoid potential overhead from too many threads competing for resources.

81-114: The logic for replacing chunk hash placeholders with actual hashes is well-implemented. It's crucial, however, to ensure that the replacement operation does not introduce any unintended modifications to the chunk files. Consider adding more detailed tests to cover edge cases, such as placeholders appearing in unexpected parts of the chunk content.

255-298: The replace_chunks_placeholder function is crucial for the new chunk generation logic. It replaces placeholders in chunk files with actual hashes. The error handling here is specific and informative, which is good practice. However, consider the performance implications of iterating over the content for each placeholder. For large files or a high number of placeholders, this could become a bottleneck. Investigating more efficient string replacement techniques or ensuring that the number of placeholders is kept minimal could mitigate potential performance issues.
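
One way to avoid rescanning the content once per placeholder is a single left-to-right pass, which works because all placeholders share a fixed length. The sketch below is an assumption about how that could look, not the PR's replace_chunks_placeholder; the function name and signature are hypothetical.

```rust
use std::collections::HashMap;

/// Replace fixed-length placeholders in one left-to-right pass.
/// `replacements` maps placeholder -> real hash; every key must be exactly
/// `len` bytes long (an assumption of this sketch).
fn replace_single_pass(content: &str, replacements: &HashMap<&str, &str>, len: usize) -> String {
    assert!(len > 0, "placeholder length must be non-zero");
    let mut out = String::with_capacity(content.len());
    let mut i = 0;
    while i < content.len() {
        // `get` returns None when the range is out of bounds or splits a UTF-8 character.
        if let Some(hash) = content.get(i..i + len).and_then(|w| replacements.get(w)) {
            out.push_str(hash);
            i += len;
        } else {
            // Copy one whole character and advance by its byte length.
            let ch = content[i..].chars().next().expect("i is on a char boundary");
            out.push(ch);
            i += ch.len_utf8();
        }
    }
    out
}
```

The cost is one hash-map lookup per character of content, independent of the number of placeholders. Whether that beats repeated `str::replace` calls in practice would need measuring on real builds.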

coderabbitai

Actionable comments posted: 2

Review Status

Configuration used: CodeRabbit UI

Commits

Files that changed from the base of the PR and between d1fdb1a and d6a0632.

Files ignored due to path filters (1)

Cargo.lock is excluded by !**/*.lock

Files selected for processing (11)

crates/mako/Cargo.toml (1 hunks)
crates/mako/src/chunk_pot.rs (7 hunks)
crates/mako/src/chunk_pot/util.rs (1 hunks)
crates/mako/src/generate_chunks.rs (2 hunks)
crates/mako/src/thread_pool.rs (1 hunks)
e2e/fixtures/generate.chunks-hash-replace/expect.js (1 hunks)
e2e/fixtures/generate.chunks-hash-replace/mako.config.json (1 hunks)
e2e/fixtures/generate.chunks-hash-replace/src/index.css (1 hunks)
e2e/fixtures/generate.chunks-hash-replace/src/index.ts (1 hunks)
e2e/fixtures/generate.chunks-hash-replace/src/lazy.css (1 hunks)
e2e/fixtures/generate.chunks-hash-replace/src/lazy.ts (1 hunks)

Files skipped from review as they are similar to previous changes (10)

crates/mako/Cargo.toml
crates/mako/src/chunk_pot.rs
crates/mako/src/chunk_pot/util.rs
crates/mako/src/thread_pool.rs
e2e/fixtures/generate.chunks-hash-replace/expect.js
e2e/fixtures/generate.chunks-hash-replace/mako.config.json
e2e/fixtures/generate.chunks-hash-replace/src/index.css
e2e/fixtures/generate.chunks-hash-replace/src/index.ts
e2e/fixtures/generate.chunks-hash-replace/src/lazy.css
e2e/fixtures/generate.chunks-hash-replace/src/lazy.ts

Additional Context Used

Additional comments not posted (5)

crates/mako/src/generate_chunks.rs (5)

11-11: The addition of nanoid as a dependency for generating unique placeholders is a good choice due to its efficiency and minimal collision risk. Ensure that the version 0.4.0 is compatible with other dependencies and the project's overall ecosystem.

55-56: Introducing ChunksHashPlaceholder and ChunksHashReplacer as HashMap types is a clear and efficient way to manage the mapping between chunk IDs and their respective placeholders or actual hash values. This approach supports the parallel generation and replacement logic effectively.

72-115: The parallel generation of entry and normal chunk files using thread_pool::join is a significant improvement. It aligns with the PR's objective to enhance performance through concurrency. However, ensure that the thread pool is adequately sized to prevent potential bottlenecks or excessive resource usage.

81-115: The logic for replacing placeholders in entry chunk files with actual hash values is well-implemented. It's crucial to ensure that this replacement process is thoroughly tested, especially in edge cases where placeholders might not be found or replaced correctly.

256-299: The replace_chunks_placeholder function is crucial for the new parallel generation logic, ensuring that placeholders in chunk files are replaced with actual hash values. It's important to handle errors gracefully and provide clear messages for debugging. Consider adding more detailed logging or metrics around this process to help identify and troubleshoot issues in production environments.

crates/mako/src/generate_chunks.rs

coderabbitai

Actionable comments posted: 2

Review Status

Configuration used: CodeRabbit UI

Commits

Files that changed from the base of the PR and between d6a0632 and d054e50.

Files selected for processing (1)

crates/mako/src/generate_chunks.rs (2 hunks)

Additional Context Used

Additional comments not posted (5)

crates/mako/src/generate_chunks.rs (5)

11-11: Importing nanoid aligns with the PR objectives for efficient chunk hashing. Ensure that the nanoid version (v0.4.0) is compatible with the project's other dependencies to avoid conflicts.

55-56: The introduction of ChunksHashPlaceholder and ChunksHashReplacer types is a good practice for managing chunk hash placeholders and their replacements. This enhances code readability and maintainability.

72-115: The parallel generation of entry and normal chunk files using thread_pool::join is a significant improvement. It aligns with the PR's goal of speeding up the build process. However, ensure that the error handling within the parallel execution paths is robust and that any potential panics are gracefully handled.

81-115: The logic for replacing chunk hash placeholders in entry chunk files is well-implemented. However, consider adding more detailed logging or metrics around the placeholder replacement process to aid in debugging and performance monitoring.

256-299: The replace_chunks_placeholder function is crucial for the dynamic replacement of chunk hash placeholders. It's well-implemented, but consider adding error logging for each specific failure case to aid in troubleshooting. Additionally, ensure that the performance impact of iterating over chunk files and performing replacements is minimal, especially for large builds.

crates/mako/src/generate_chunks.rs

stormslowly

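We could record an action item to check how expensive the repeated split_modules run for entry chunks is.
If it needs optimizing, the logic in generate_chunk_files can be restructured into a flow like
chunk -> chunk_pot -> chunk_files
In the chunk_pot stage, together with the module graph, the js_map and css_map of each entry can be computed.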

xusd320 requested review from sorrycc, stormslowly and PeachScript March 29, 2024 06:29

coderabbitai bot reviewed Mar 29, 2024

crates/mako/src/generate_chunks.rs (outdated, resolved)

coderabbitai bot reviewed Mar 29, 2024

perf: parallel generate

d6a0632

xusd320 changed the title ~~perf: parallel generate, speed up 10%~~ [WIP] perf: parallel generate, speed up 10% Mar 29, 2024

xusd320 force-pushed the perf/parallel-generate branch from d8cd5bc to d6a0632 Compare March 29, 2024 11:24

coderabbitai bot reviewed Mar 29, 2024

crates/mako/src/generate_chunks.rs (outdated, resolved)

crates/mako/src/generate_chunks.rs (outdated, resolved)

chore: improve error handling

d054e50

xusd320 changed the title ~~[WIP] perf: parallel generate, speed up 10%~~ perf: parallel generate, speed up 10% Mar 29, 2024

coderabbitai bot reviewed Mar 29, 2024

crates/mako/src/generate_chunks.rs (outdated, resolved)

crates/mako/src/generate_chunks.rs (outdated, resolved)

xusd320 changed the title ~~perf: parallel generate, speed up 10%~~ WIP perf: parallel generate, speed up 10% Mar 29, 2024

chore: improve error handling

f63d47c

xusd320 changed the title ~~WIP perf: parallel generate, speed up 10%~~ perf: parallel generate, speed up 10% Apr 1, 2024

xusd320 added 2 commits April 1, 2024 16:38

fix: e2e generate.chunks-hash-replace

4342c62

chore: rename variables

0041570

stormslowly approved these changes Apr 1, 2024

chore: add some todo to be optimized later

636dd7a

sorrycc approved these changes Apr 3, 2024

sorrycc merged commit b106e28 into master Apr 3, 2024
7 of 8 checks passed

delete-merged-branch bot deleted the perf/parallel-generate branch April 3, 2024 01:59
